Write a function to check that a binary tree ↴
A binary tree is a tree where every node has two or fewer children. The children are usually called left and right.
class BinaryTreeNode(object):
def __init__(self, value):
self.value = value
self.left = None
self.right = None
This lets us build a structure like this:
That particular example is special because every level of the tree is completely full. There are no "gaps." We call this kind of tree "perfect."
Binary trees have a few interesting properties when they're perfect:
Property 1: the number of total nodes on each "level" doubles as we move down the tree.
Property 2: the number of nodes on the last level is equal to the sum of the number of nodes on all other levels (plus 1). In other words, about half of our nodes are on the last level.
Let's call the number of nodes n, and the height of the tree h. h can also be thought of as the "number of levels."
If we had h, how could we calculate n?
Let's just add up the number of nodes on each level! How many nodes are on each level?
If we zero-index the levels, the number of nodes on the xth level is exactly 2x.
So our total number of nodes is:
n=20+21+22+23+...+2h−1Why only up to 2h−1? Notice that we started counting our levels at 0. So if we have h levels in total, the last level is actually the "h−1"-th level. That means the number of nodes on the last level is 2h−1.
But we can simplify. Property 2 tells us that the number of nodes on the last level is (1 more than) half of the total number of nodes, so we can just take the number of nodes on the last level, multiply it by 2, and subtract 1 to get the number of nodes overall. We know the number of nodes on the last level is 2h−1, So:
n=2h−1∗2−1 n=2h−1∗21−1 n=2h−1+1−1 n=2h−1So that's how we can go from h to n. What about the other direction?
We need to bring the h down from the exponent. That's what logs are for!
First, some quick review. log10(100) simply means, "What power must you raise 10 to in order to get 100?". Which is 2, because 102=100.
We can use logs in algebra to bring variables down from exponents by exploiting the fact that we can simplify log10(102). What power must we raise 10 to in order to get 102? That's easy—it's 2.
So in this case we can take the log2 of both sides:
n=2h−1 n+1=2h log2((n+1))=log2(2h) log2(n+1)=hSo that's the relationship between height and total nodes in a perfect binary tree.
| Balanced | Unbalanced (Worst Case) | |
|---|---|---|
| space | O(n) | O(n) |
| insert | O(lg(n)) | O(n) |
| lookup | O(lg(n)) | O(n) |
| delete | O(lg(n)) | O(n) |
A binary search tree is a binary tree where the nodes are ordered in a specific way. For every node:
Checking if a binary tree is a binary search tree is a favorite question from interviews.
Good performance across the board. Assuming they're balanced, binary search trees are good at lots of operations, even if they're not constant time for anything.
Two binary search trees can store the same values in different ways:
Some trees (like AVL trees or Red-Black trees) rearrange nodes as they're inserted to ensure the tree is always balanced. With these, the worst case complexity for searching, inserting, or deleting is always O(lg(n)), not O(n).
Here's a sample binary tree node class:
class BinaryTreeNode(object):
def __init__(self, value):
self.value = value
self.left = None
self.right = None
def insert_left(self, value):
self.left = BinaryTreeNode(value)
return self.left
def insert_right(self, value):
self.right = BinaryTreeNode(value)
return self.right
Consider this example:
Notice that when you check the blue node against its parent, it seems correct. However, it's greater than the root, so it should be in the root's right subtree. So we see that checking a node against its parent isn't sufficient to prove that it's in the correct spot.
We can do this in O(n) time and O(n) additional space, where n is the number of nodes in our tree. Our additional space is O(lgn) if our tree is balanced.
One way to break the problem down is to come up with a way to confirm that a single node is in a valid place relative to its ancestors. Then if every node passes this test, our whole tree is a valid BST.
What makes a given node "correct" relative to its ancestors in a BST? Two things:
So we could do a walk through our binary tree, keeping track of the ancestors for each node and whether the node should be greater than or less than each of them. If each of these greater-than or less-than relationships holds true for each node, our BST is valid.
The simplest ways to traverse the tree are depth-first ↴
Depth-first search (DFS) is a method for exploring a tree or graph. In a DFS, you go as deep as possible down one path before backing up and trying a different one.
Depth-first search is like walking through a corn maze. You explore one path, hit a dead end, and go back and try a different one.
Here's a how a DFS would traverse this tree, starting with the root:
We'd go down the first path we find until we hit a dead end:
Then we'd do the same thing again—go down a path until we hit a dead end:
And again:
And again:
Until we reach the end.
Depth-first search is often compared with breadth-first search.
Advantages:
Disadvantages
Breadth-first search (BFS) is a method for exploring a tree or graph. In a BFS, you first explore all the nodes one step away, then all the nodes two steps away, etc.
Breadth-first search is like throwing a stone in the center of a pond. The nodes you explore "ripple out" from the starting point.
Here's a how a BFS would traverse this tree, starting with the root:
We'd visit all the immediate children (all the nodes that're one step away from our starting node):
Then we'd move on to all those nodes' children (all the nodes that're two steps away from our starting node):
And so on:
Until we reach the end.
Breadth-first search is often compared with depth-first search.
Advantages:
Disadvantages
Because the tree's breadth can as much as double each time it gets one level deeper, depth-first traversal is likely to be more space-efficient than breadth-first traversal, though they are strictly both O(n) additional space in the worst case. The space savings are obvious if we know our binary tree is balanced—its depth will be O(lgn) and its breadth will be O(n).
But we're not just storing the nodes themselves in memory, we're also storing the value from each ancestor and whether it should be less than or greater than the given node. Each node has O(n) ancestors (O(lgn) for a balanced binary tree), so that gives us O(n2) additional memory cost (O(nlgn) for a balanced binary tree). We can do better.
Let's look at the inequalities we'd need to store for a given node:
Notice that we would end up testing that the blue node is <20, <30, and <50. Of course, <30 and <50 are implied by <20. So instead of storing each ancestor, we can just keep track of a lower_bound and upper_bound that our node's value must fit inside.
We do a depth-first walk through the tree, testing each node for validity as we go. If a node appears in the left subtree of an ancestor, it must be less than that ancestor. If a node appears in the right subtree of an ancestor, it must be greater than that ancestor.
Instead of keeping track of every ancestor to check these inequalities, we just check the largest number it must be greater than (its lower_bound) and the smallest number it must be less than (its upper_bound).
def is_binary_search_tree(root):
# Start at the root, with an arbitrarily low lower bound
# and an arbitrarily high upper bound
node_and_bounds_stack = [(root, -float('inf'), float('inf'))]
# Depth-first traversal
while len(node_and_bounds_stack):
node, lower_bound, upper_bound = node_and_bounds_stack.pop()
# If this node is invalid, we return false right away
if (node.value <= lower_bound) or (node.value >= upper_bound):
return False
if node.left:
# This node must be less than the current node
node_and_bounds_stack.append((node.left, lower_bound, node.value))
if node.right:
# This node must be greater than the current node
node_and_bounds_stack.append((node.right, node.value, upper_bound))
# If none of the nodes were invalid, return true
# (at this point we have checked all nodes)
return True
Instead of allocating a stack ↴
| Worst Case | |
|---|---|
| space | O(n) |
| push | O(1) |
| pop | O(1) |
| peek | O(1) |
A stack stores items in a last-in, first-out (LIFO) order.
Picture a pile of dirty plates in your sink. As you add more plates, you bury the old ones further down. When you take a plate off the top to wash it, you're taking the last plate you put in. "Last in, first out."
You can implement a stack with either a linked list or a dynamic array—they both work pretty well:
| Stack Push | Stack Pop | |
|---|---|---|
| Linked Lists | insert at head | remove at head |
| Dynamic Arrays | append | remove last element |
The call stack is what a program uses to keep track of function calls. The call stack is made up of stack frames—one for each function call.
For instance, say we called a function that rolled two dice and printed the sum.
def roll_die():
return random.randint(1, 6)
def roll_two_and_sum():
total = 0
total += roll_die()
total += roll_die()
print total
roll_two_and_sum()
First, our program calls roll_two_and_sum(). It goes on the call stack:
roll_two_and_sum()
That function calls roll_die(), which gets pushed on to the top of the call stack:
roll_die()
roll_two_and_sum()
Inside of roll_die(), we call random.randint(). Here's what our call stack looks like then:
random.randint()
roll_die()
roll_two_and_sum()
When random.randint() finishes, we return back to roll_die() by removing ("popping") random.randint()'s stack frame.
roll_die()
roll_two_and_sum()
Same thing when roll_die() returns:
roll_two_and_sum()
We're not done yet! roll_two_and_sum() calls roll_die() again:
roll_die()
roll_two_and_sum()
Which calls random.randint() again:
random.randint()
roll_die()
roll_two_and_sum()
random.randint() returns, then roll_die() returns, putting us back in roll_two_and_sum():
roll_two_and_sum()
Which calls print():
print()
roll_two_and_sum()
What actually goes in a function's stack frame?
A stack frame usually stores:
Some of the specifics vary between processor architectures. For instance, AMD64 (64-bit x86) processors pass some arguments in registers and some on the call stack. And, ARM processors (common in phones) store the return address in a special register instead of putting it on the call stack.
Each function call creates its own stack frame, taking up space on the call stack. That's important because it can impact the space complexity of an algorithm. Especially when we use recursion.
For example, if we wanted to multiply all the numbers between 1 and n, we could use this recursive approach:
def product_1_to_n(n):
return 1 if n <= 1 else n * product_1_to_n(n - 1)
What would the call stack look like when n = 10?
First, product_1_to_n() gets called with n = 10:
product_1_to_n() n = 10
This calls product_1_to_n() with n = 9.
product_1_to_n() n = 9
product_1_to_n() n = 10
Which calls product_1_to_n() with n = 8.
product_1_to_n() n = 8
product_1_to_n() n = 9
product_1_to_n() n = 10
And so on until we get to n = 1.
product_1_to_n() n = 1
product_1_to_n() n = 2
product_1_to_n() n = 3
product_1_to_n() n = 4
product_1_to_n() n = 5
product_1_to_n() n = 6
product_1_to_n() n = 7
product_1_to_n() n = 8
product_1_to_n() n = 9
product_1_to_n() n = 10
Look at the size of all those stack frames! The entire call stack takes up O(n) space. That's right—we have an O(n) space cost even though our function itself doesn't create any data structures!
What if we'd used an iterative approach instead of a recursive one?
def product_1_to_n(n):
# We assume n >= 1
result = 1
for num in range(1, n + 1):
result *= num
return result
This version takes a constant amount of space. At the beginning of the loop, the call stack looks like this:
product_1_to_n() n = 10, result = 1, num = 1
As we iterate through the loop, the local variables change, but we stay in the same stack frame because we don't call any other functions.
product_1_to_n() n = 10, result = 2, num = 2
product_1_to_n() n = 10, result = 6, num = 3
product_1_to_n() n = 10, result = 24, num = 4
In general, even though the compiler or interpreter will take care of managing the call stack for you, it's important to consider the depth of the call stack when analyzing the space complexity of an algorithm.
Be especially careful with recursive functions! They can end up building huge call stacks.
What happens if we run out of space? It's a stack overflow! In Python 2.7, you'll get a RecursionError.
If the very last thing a function does is call another function, then its stack frame might not be needed any more. The function could free up its stack frame before doing its final call, saving space.
This is called tail call optimization (TCO). If a recursive function is optimized with TCO, then it may not end up with a big call stack.
In general, most languages don't provide TCO. Scheme is one of the few languages that guarantee tail call optimization. Some Ruby, C, and Javascript implementations may do it. Python and Java decidedly don't.
def is_binary_search_tree(root,
lower_bound=-float('inf'),
upper_bound=float('inf')):
if not root:
return True
if (root.value >= upper_bound or root.value <= lower_bound):
return False
return (is_binary_search_tree(root.left, lower_bound, root.value)
and is_binary_search_tree(root.right, root.value, upper_bound))
Checking if an in-order traversal of the tree is sorted is a great answer too, especially if you're able to implement it without storing a full list of nodes.
O(n) time and O(n) space.
The time cost is easy: for valid binary search trees, we’ll have to check all n nodes.
Space is a little more complicated. Because we’re doing a depth first search, node_and_bounds_stack will hold at most d nodes where d is the depth of the tree (the number of levels in the tree from the root node down to the lowest node). So we could say our space cost is O(d).
But we can also relate d to n. In a balanced tree, d is log2n. And the more unbalanced the tree gets, the closer d gets to n.
In the worst case, the tree is a straight line of right children from the root where every node in that line also has a left child. The traversal will walk down the line of right children, adding a new left child to the stack at each step. When the traversal hits the rightmost node, the stack will hold half of the n total nodes in the tree. Half is O(n), so our worst case space cost is O(n).
What if the input tree has duplicate values?
What if -float('inf') or float('inf') appear in the input tree?
We could think of this as a greedy ↴
A greedy algorithm builds up a solution by choosing the option that looks the best at every step.
Say you're a cashier and need to give someone 67 cents (US) using as few coins as possible. How would you do it?
Whenever picking which coin to use, you'd take the highest-value coin you could. A quarter, another quarter, then a dime, a nickel, and finally two pennies. That's a greedy algorithm, because you're always greedily choosing the coin that covers the biggest portion of the remaining amount.
Some other places where a greedy algorithm gets you the best solution:
Careful: sometimes a greedy algorithm doesn't give you an optimal solution:
Validating that a greedy strategy always gets the best answer is tricky. Either prove that the answer produced by the greedy algorithm is as good as an optimal answer, or run through a rigorous set of test cases to convince your interviewer (and yourself) that its correct.
We could also think of this as a sort of "divide and conquer" approach. The idea in general behind divide and conquer is to break the problem down into two or more subproblems, solve them, and then use that solution to solve the original problem.
In this case, we're dividing the problem into subproblems by saying, "This tree is a valid binary search tree if the left subtree is valid and the right subtree is valid." This is more apparent in the recursive formulation of the answer above.
Of course, it's just fine that our approach could be thought of as greedy or could be thought of as divide and conquer. It can be both. The point here isn't to create strict categorizations so we can debate whether or not something "counts" as divide and conquer.
Instead, the point is to recognize the underlying patterns behind algorithms, so we can get better at thinking through problems.
Sometimes we'll have to kinda smoosh together two or more different patterns to get our answer.
Wanna review this one again later? Or do you feel like you got it all?
Wanna review this one again later? Or do you feel like you got it all?
Mark as done Pin for review laterWant more coding interview help?
Check out for more advice, guides, and practice questions.
Reset editor
Powered by qualified.io